CoCoS: Fast and Accurate Distributed Triangle Counting in Graph Streams


Abstract

Given a graph stream, how can we estimate the number of triangles in it using multiple machines with limited storage? Specifically, how should edges be processed and sampled across the machines for rapid and accurate estimation? The count of triangles (i.e., cliques of size three) has proven useful in numerous applications, including anomaly detection, community detection, and link recommendation. For triangle counting in large and dynamic graphs, recent work has focused largely on streaming algorithms and distributed algorithms, but little on their combinations for “the best of both worlds.” In this work, we propose CoCoS, a fast and accurate distributed streaming algorithm for estimating the counts of global triangles (i.e., all triangles) and local triangles incident to each node. Making one pass over the input stream, CoCoS carefully processes and stores the edges across multiple machines so that the redundant use of computational and storage resources is minimized. Compared to baselines, CoCoS is: (a) accurate: giving smaller estimation error; (b) fast: up to 10.4× faster, scaling linearly with the size of the input stream; and (c) theoretically sound: yielding unbiased estimates.
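To make the idea of one-pass, unbiased estimation of global and local triangle counts concrete, here is a minimal single-machine sketch in Python. It is not the CoCoS algorithm and involves no distribution across machines. It is a plain Bernoulli edge-sampling estimator, and the function name `estimate_triangles` and the parameters `p` and `seed` are assumptions made for illustration. Each arriving edge is first checked against the edges sampled so far, then stored with probability `p`. A triangle is detected only if its first two edges were both stored, which happens with probability `p^2`, so each detection is weighted by `1/p^2` to keep the estimate unbiased (assuming the stream contains no duplicate edges).

```python
import random
from collections import defaultdict


def estimate_triangles(edge_stream, p=1.0, seed=0):
    """One-pass estimate of global and local triangle counts.

    Hypothetical helper, not the CoCoS algorithm: each edge is kept
    independently with probability p (assumes no duplicate edges).
    """
    rng = random.Random(seed)
    neighbors = defaultdict(set)      # adjacency of the sampled edges only
    global_count = 0.0
    local_count = defaultdict(float)  # node -> estimated incident triangles
    weight = 1.0 / (p * p)            # a triangle is detected with probability p^2

    for u, v in edge_stream:
        if u == v:
            continue  # ignore self-loops
        # Every common neighbor w closes a triangle (u, v, w) with the new edge.
        for w in neighbors[u] & neighbors[v]:
            global_count += weight
            local_count[u] += weight
            local_count[v] += weight
            local_count[w] += weight
        # Store the new edge with probability p.
        if rng.random() < p:
            neighbors[u].add(v)
            neighbors[v].add(u)

    return global_count, dict(local_count)
```

With `p=1.0` every edge is stored and the counts are exact, which is a convenient sanity check. Smaller values of `p` trade accuracy for storage, which is the tension the abstract describes.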


Similar resources

DiSLR: Distributed Sampling with Limited Redundancy For Triangle Counting in Graph Streams

Given a web-scale graph that grows over time, how should its edges be stored and processed on multiple machines for rapid and accurate estimation of the count of triangles? The count of triangles (i.e., cliques of size three) has proven useful in many applications, including anomaly detection, community detection, and link recommendation. For triangle counting in large and dynamic graphs, recent...


FURL: Fixed-memory and Uncertainty Reducing Local Triangle Counting for Graph Streams

How can we accurately estimate local triangles for all nodes in simple and multigraph streams? Local triangle counting in a graph stream is one of the most fundamental tasks in graph mining with important applications including anomaly detection and social network analysis. Although there have been several local triangle counting methods in a graph stream, their estimation has a large variance ...


Continuous Distributed Counting for Non-monotonous Streams

We consider the continual count tracking problem in a distributed environment where the input is an aggregate stream originating from k distinct sites and the updates are allowed to be non-monotonous, i.e., both increments and decrements are allowed. The goal is to continually track the count within a prescribed relative accuracy at the lowest possible communication cost. Specifically...


Fast, accurate call graph profiling

Existing methods for call graph profiling, such as that used by gprof, deal badly with programs that have shared subroutines, mutual recursion, higher-order functions, or dynamic method binding. This article discusses a way of improving the accuracy of a call graph profile by collecting more information during execution, without significantly increasing the overhead of profiling. The method ...


A second look at counting triangles in graph streams

In this paper we present improved results on the problem of counting triangles in edge streamed graphs. For graphs with m edges and at least T triangles, we show that an extra look over the stream yields a two-pass streaming algorithm that uses $O(m / (\epsilon^{4.5} \sqrt{T}))$ space and outputs a $(1 + \epsilon)$ approximation of the number of triangles in the graph. This improves upon the two-pass streaming tester of B...



Journal

Journal title: ACM Transactions on Knowledge Discovery From Data

Year: 2021

ISSN: 1556-472X, 1556-4681

DOI: https://doi.org/10.1145/3441487